KEC@DPIL-FIRE2016: Detection of Paraphrases in Indian Languages (Tamil)

Authors

  • R. Thangarajan
  • S. V. Kogilavani
  • A. Karthic
  • S. Jawahar
Abstract

This paper reports on the participation of the team NLP@KEC of Kongu Engineering College in the shared task on Detecting Paraphrases in Indian Languages (DPIL), specifically for the Tamil language. Automatic paraphrase detection is an intellectually demanding task with many applications, such as plagiarism detection and new event detection. A paraphrase is defined as the expression of a given fact in more than one way by means of different phrases. Paraphrase identification is a classic natural language processing task that is framed as classification. Although several algorithms for paraphrase identification exist, capturing the semantic relations between the constituent parts of a sentence plays a very important role. In this paper we use sixteen different features to represent the similarity between sentences. The proposed approach uses machine learning algorithms such as Support Vector Machine and Maximum Entropy to classify a given sentence pair. Sentence pairs are classified as Paraphrase or Not-a-Paraphrase for Task 1, and as Paraphrase, Not-a-Paraphrase, or Semi-Paraphrase for Task 2. The methods are evaluated using accuracy, precision, recall, F-measure, and macro F-measure. Our methodology placed second in the DPIL evaluation track.
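The evaluation measures named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' or the organizers' scoring code; the function names and the label strings "P", "NP", and "SP" (for Paraphrase, Not-a-Paraphrase, and Semi-Paraphrase) are assumptions made for the example. It computes accuracy, per-class precision, recall, and F-measure, and the macro F-measure as the unweighted mean of the per-class F-measures.

```python
from collections import Counter


def per_class_scores(gold, predicted):
    """Return {label: (precision, recall, f1)} for each label in gold or predicted."""
    labels = sorted(set(gold) | set(predicted))
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, predicted):
        if g == p:
            tp[g] += 1          # correct prediction for class g
        else:
            fp[p] += 1          # p was predicted but is wrong
            fn[g] += 1          # g was the true class but was missed
    scores = {}
    for label in labels:
        predicted_count = tp[label] + fp[label]
        gold_count = tp[label] + fn[label]
        precision = tp[label] / predicted_count if predicted_count else 0.0
        recall = tp[label] / gold_count if gold_count else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        scores[label] = (precision, recall, f1)
    return scores


def accuracy(gold, predicted):
    """Fraction of sentence pairs whose predicted label matches the gold label."""
    return sum(g == p for g, p in zip(gold, predicted)) / len(gold)


def macro_f1(gold, predicted):
    """Unweighted mean of the per-class F-measures."""
    scores = per_class_scores(gold, predicted)
    return sum(f1 for _, _, f1 in scores.values()) / len(scores)


if __name__ == "__main__":
    # Hypothetical labels for six sentence pairs in the three-class setting (Task 2).
    gold = ["P", "P", "NP", "NP", "SP", "SP"]
    predicted = ["P", "NP", "NP", "NP", "SP", "P"]
    print("accuracy:", accuracy(gold, predicted))
    print("per-class:", per_class_scores(gold, predicted))
    print("macro F:", macro_f1(gold, predicted))
```

Because the macro F-measure weights every class equally, a system that performs poorly on a less frequent class is penalized more than it would be under plain accuracy.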

Similar resources

DPIL@FIRE2016: Overview of the Shared task on Detecting Paraphrases in Indian language

This paper explains the overview of the shared task "Detecting Paraphrases in Indian Languages" (DPIL) conducted at FIRE 2016. Given a pair of sentences in the same language, participants are asked to detect the semantic equivalence between the sentences. The shared task is proposed for four Indian languages namely Tamil, Malayalam, Hindi and Punjabi. The dataset created for the shared task has...

JU_NLP@DPIL-FIRE2016: Paraphrase Detection in Indian Languages - A Machine Learning Approach

This paper presents our system report on our participation in the shared task on “Detecting Paraphrases in Indian Languages (DPIL)” organized in the “Forum for Information Retrieval Evaluation (FIRE)” 2016, in both the tasks (Task1 and Task2) defined in this shared task in four Indian languages (Tamil, Malayalam, Hindi and Punjabi). We made use of different similarity measures and machine transl...

KS_JU@DPIL-FIRE2016: Detecting Paraphrases in Indian Languages Using Multinomial Logistic Regression Model

In this work, we describe a system that detects paraphrases in Indian Languages as part of our participation in the shared Task on detecting paraphrases in Indian Languages (DPIL) organized by Forum for Information Retrieval Evaluation (FIRE) in 2016. Our paraphrase detection method uses a multinomial logistic regression model trained with a variety of features which are basically lexical and s...

NLP-NITMZ@DPIL-FIRE2016: Language Independent Paraphrases Detection

In this paper we describe the detailed information of NLP-NITMZ system on the participation of DPIL shared task at Forum for Information Retrieval Evaluation (FIRE 2016). The main aim of DPIL shared task is to detect paraphrases in Indian Languages. Paraphrase detection is an important part in the field of Information Retrieval, Document Summarization, Question Answering, Plagiarism Detection e...

HIT2016@DPIL-FIRE2016: Detecting Paraphrases in Indian Languages based on Gradient Tree Boosting

Detecting paraphrase is an important and challenging task. It can be used in paraphrases generation and extraction, machine translation, question and answer and plagiarism detection. Since the same meaning of a sentence is expressed in another sentence using different words, it makes the traditional methods based on lexical similarity ineffective. In this paper, we describe a strategy of Detect...

Publication date: 2016